HTSZ_CEM System for Chemical Entity Mention Recognition in Patents
Abstract
This paper proposes a machine learning-based system for the chemical entity mention recognition in patents (CEMP) task of BioCreative V. The CEMP task is treated as a sequence labeling problem and solved with conditional random fields (CRF). Evaluated on the CEMP challenge corpus, our system (team 293) achieved a micro F-measure of 87.03%.
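To make the sequence labeling framing concrete, here is a minimal sketch of how entity mentions can be encoded as one tag per token and how a micro-averaged F-measure can be computed from exact span matches. It is not the authors' implementation: the BIO tag scheme, the token-index span representation, the exact-match scoring, and all function names (`spans_to_bio`, `bio_to_spans`, `micro_f_measure`) are assumptions made for illustration. The CRF model itself is left out; in a real system its predicted tag sequence would take the place of `predicted_tags`.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping mention spans as one BIO tag per token."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence back into mention spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing mention
        if tag != "I" and start is not None:
            spans.append((start, i))
            start = None
        if tag == "B" or (tag == "I" and start is None):  # tolerate I without B
            start = i
    return spans


def micro_f_measure(gold: List[Set[Span]], pred: List[Set[Span]]) -> float:
    """Micro-averaged F-measure: counts are pooled over all documents."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical sentence with two gold mentions: "sodium chloride", "ethanol".
    tokens = ["A", "solution", "of", "sodium", "chloride", "in", "ethanol"]
    gold_tags = spans_to_bio(len(tokens), [(3, 5), (6, 7)])
    # Stand-in for a tagger's output that misses the second mention.
    predicted_tags = ["O", "O", "O", "B", "I", "O", "O"]
    score = micro_f_measure(
        [set(bio_to_spans(gold_tags))], [set(bio_to_spans(predicted_tags))]
    )
    print(gold_tags, round(score, 4))
```

Pooling true positives across documents before computing precision and recall is what makes the average "micro": documents with many mentions weigh more than documents with few.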
Similar resources
Adapting ChER for the recognition of chemical mentions in patents
ChER (Chemical Entity Recogniser) is a pipeline of natural language processing tools optimised for the recognition of chemical names in scientific abstracts. It formed the basis of our submissions to the previous edition of the CHEMDNER track in BioCreative IV, and was one of the top-performing systems both for the chemical document indexing (CDI) and chemical entity mention recognition (CEM) s...
Chemical entity recognition in patents by combining dictionary-based and statistical approaches
We describe the development of a chemical entity recognition system and its application in the CHEMDNER-patent track of BioCreative 2015. This community challenge includes a Chemical Entity Mention in Patents (CEMP) recognition task and a Chemical Passage Detection (CPD) classification task. We addressed both tasks by an ensemble system that combines a dictionary-based approach with a statistic...
HITextracter System for Chemical and Gene/Protein Entity Mention Recognition in Patents
In this paper, a hybrid system was proposed for chemical entity mention recognition (CEMP) and gene/protein related object recognition (GPRO) in the BeCalm challenge. First, five individual machine learning-based subsystems were developed to identify chemical and gene/protein related entity mentions, that is, a bidirectional LSTM (long short-term memory, a variant of recurrent neural network)-bas...
NERChem: adapting NERBio to chemical patents via full-token features and named entity feature with chemical sub-class composition
Chemical patents contain detailed information on novel chemical compounds that is valuable to the chemical and pharmaceutical industries. In this paper, we introduce a system, NERChem, that can recognize chemical named entity mentions in chemical patents. NERChem is based on the conditional random fields (CRF) model. Our approach incorporates (1) class composition, which is used for combining ch...
DUTIR at the BioCreative V.5.BeCalm Tasks: A BLSTM-CRF Approach for Biomedical Entity Recognition in Patents
Patents contain a significant amount of information. Biomedical text mining in patents has received much attention recently, especially in the medicinal chemistry domain. The BioCreative V.5.BeCalm tasks focus on biomedical entity recognition in patents. This paper describes the method used to create our submissions to the Chemical Entity Mention recognition (CEMP) and Gene and Protein Rela...
Publication date: 2015